Add new compactor metrics for tenant-prefixed object store #322

Open
willh-db wants to merge 1 commit into databricks:db_main from willh-db:mt-compactor-metrics

Conversation

@willh-db
Collaborator

Changes

  • Add thanos_compact_tenant_assigned{tenant} gauge to expose which tenants are assigned to each compactor instance, enabling verification of tenant partitioning across replicas
  • Add thanos_compact_tenant_iterations_total{tenant} counter to track successful compaction iterations per tenant, enabling verification that compaction is completing end-to-end for every tenant

The existing compactor metrics are either global (thanos_compact_iterations_total) or only carry a resolution label (thanos_compact_group_compaction_runs_*). In multitenant mode, there is no way to confirm via metrics that:

  1. A specific tenant is assigned to the expected compactor replica
  2. Compaction is actually completing for each individual tenant

These two new metrics close that gap with minimal overhead.

@willh-db willh-db requested review from jnyi and yuchen-db March 31, 2026 23:17
Comment thread cmd/thanos/compact.go
runWebServer(g, ctx, logger, cancel, reg, &conf, component, tracer, progressRegistry, globalBaseMetaFetcher, api, srv)

for _, tenantPrefix := range tenantPrefixes {
    compactMetrics.tenantAssigned.WithLabelValues(tenantPrefix).Set(1)
Collaborator

nit: remind me what is tenantPrefixes?

Collaborator Author

it was v1/raw/<tenant> but now it's just <tenant>

Comment thread cmd/thanos/compact.go
    Help: "Total number of compaction iterations completed successfully per tenant.",
}, []string{"tenant"})
m.tenantAssigned = promauto.With(reg).NewGaugeVec(prometheus.GaugeOpts{
    Name: "thanos_compact_tenant_assigned",
Collaborator

we also have this metric thanos_blocks_meta_assigned for the tenant view (which tenant got assigned to which compactor). Could we reuse that?

Collaborator Author

blocks_meta is more like "how many blocks seen per tenant" whereas tenant_iterations is "how many times has compaction run per tenant." I think this will be valuable for ensuring liveness and being able to alert on compaction stalls.

tenant_assigned is useful to check on startup but in steady-state it is redundant with the other two. Let me know if you think it's worth keeping

Collaborator

@jnyi jnyi left a comment

might reuse existing metric thanos_blocks_meta_assigned

@willh-db willh-db requested a review from jnyi April 1, 2026 22:58